NSF PAR Search | NSF Public Access Repository

Fast controlled generation from language models with adaptive weighted rejection sampling

Lipkin, Ben; LeBrun, Benjamin; Vigly, Jacob Hoover; Loula, João; MacIver, David R; Du, Li; Eisner, Jason; Cotterell, Ryan; Mansinghka, Vikash; O'Donnell, Timothy J; et al. (October 2025, Second Conference on Language Modeling (COLM))

Full Text Available
Learning How to Ask: Querying LMs with Mixtures of Soft Prompts

https://doi.org/10.18653/v1/2021.naacl-main.410

Qin, Guanghui; Eisner, Jason (June 2021, Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT))

Full Text Available
Limitations of Autoregressive Models and Their Alternatives

https://doi.org/10.18653/v1/2021.naacl-main.405

Lin, Chu-Cheng; Jaech, Aaron; Li, Xin; Gormley, Matthew R.; Eisner, Jason (June 2021, Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT))

Full Text Available
Specializing Word Embeddings (for Parsing) by Information Bottleneck

https://doi.org/10.18653/v1/D19-1276

Li, Xiang Lisa; Eisner, Jason (November 2019, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing)

Full Text Available
Spelling-Aware Construction of Macaronic Texts for Teaching Foreign-Language Vocabulary

https://doi.org/10.18653/v1/D19-1679

Renduchintala, Adithya; Koehn, Philipp; Eisner, Jason (November 2019, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing)

Full Text Available
A Generative Model for Punctuation in Dependency Trees

https://doi.org/10.1162/tacl_a_00273

Li, Xiang Lisa; Wang, Dingquan; Eisner, Jason (November 2019, Transactions of the Association for Computational Linguistics)

Treebanks traditionally treat punctuation marks as ordinary words, but linguists have suggested that a tree’s “true” punctuation marks are not observed (Nunberg, 1990). These latent “underlying” marks serve to delimit or separate constituents in the syntax tree. When the tree’s yield is rendered as a written sentence, a string rewriting mechanism transduces the underlying marks into “surface” marks, which are part of the observed (surface) string but should not be regarded as part of the tree. We formalize this idea in a generative model of punctuation that admits efficient dynamic programming. We train it without observing the underlying marks, by locally maximizing the incomplete data likelihood (similarly to the EM algorithm). When we use the trained model to reconstruct the tree’s underlying punctuation, the results appear plausible across 5 languages, and in particular are consistent with Nunberg’s analysis of English. We show that our generative model can be used to beat baselines on punctuation restoration. Also, our reconstruction of a sentence’s underlying punctuation lets us appropriately render the surface punctuation (via our trained underlying-to-surface mechanism) when we syntactically transform the sentence.
Full Text Available
Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model

https://doi.org/10.1609/aaai.v33i01.33016843

Mielke, Sebastian J.; Eisner, Jason (July 2019, Proceedings of the AAAI Conference on Artificial Intelligence)

We show how the spellings of known words can help us deal with unknown words in open-vocabulary NLP tasks. The method we propose can be used to extend any closed-vocabulary generative model, but in this paper we specifically consider the case of neural language modeling. Our Bayesian generative story combines a standard RNN language model (generating the word tokens in each sentence) with an RNN-based spelling model (generating the letters in each word type). These two RNNs respectively capture sentence structure and word structure, and are kept separate as in linguistics. By invoking the second RNN to generate spellings for novel words in context, we obtain an open-vocabulary language model. For known words, embeddings are naturally inferred by combining evidence from type spelling and token context. Comparing to baselines (including a novel strong baseline), we beat previous work and establish state-of-the-art results on multiple datasets.
Full Text Available
Simple Construction of Mixed-Language Texts for Vocabulary Learning

https://doi.org/10.18653/v1/W19-4439

Renduchintala, Adithya; Koehn, Philipp; Eisner, Jason (August 2019, Proceedings of the 14th Workshop on Innovative Use of NLP for Building Educational Applications (BEA))

Full Text Available
On the Complexity and Typology of Inflectional Morphological Systems

https://doi.org/10.1162/tacl_a_00271

Cotterell, Ryan; Kirov, Christo; Hulden, Mans; Eisner, Jason (November 2019, Transactions of the Association for Computational Linguistics)

We quantify the linguistic complexity of different languages’ morphological systems. We verify that there is a statistically significant empirical trade-off between paradigm size and irregularity: A language’s inflectional paradigms may be either large in size or highly irregular, but never both. We define a new measure of paradigm irregularity based on the conditional entropy of the surface realization of a paradigm (how hard it is to jointly predict all the word forms in a paradigm from the lemma). We estimate irregularity by training a predictive model. Our measurements are taken on large morphological paradigms from 36 typologically diverse languages.
Full Text Available
A Corpus for Large-Scale Phonetic Typology

https://doi.org/10.18653/v1/2020.acl-main.415

Salesky, Elizabeth; Chodroff, Eleanor; Pimentel, Tiago; Wiesner, Matthew; Cotterell, Ryan; Black, Alan W; Eisner, Jason (July 2020, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL))

Full Text Available
